Reducing Weight Undertraining in Structured Discriminative Learning
نویسندگان
چکیده
Discriminative probabilistic models are very popular in NLP because of the latitude they afford in designing features. But training involves complex trade-offs among weights, which can be dangerous: a few highlyindicative features can swamp the contribution of many individually weaker features, causing their weights to be undertrained. Such a model is less robust, for the highly-indicative features may be noisy or missing in the test data. To ameliorate this weight undertraining, we introduce several new feature bagging methods, in which separate models are trained on subsets of the original features, and combined using a mixture model or a product of experts. These methods include the logarithmic opinion pools used by Smith et al. (2005). We evaluate feature bagging on linear-chain conditional random fields for two natural-language tasks. On both tasks, the feature-bagged CRF performs better than simply training a single CRF on all the features.
منابع مشابه
Feature Bagging: Preventing Weight Undertraining in Structured Discriminative Learning
Discriminatively-trained probabilistic models are very popular in NLP because of the latitude they afford in designing features. But training involves complex trade-offs among weights, which can be dangerous: a few highlyindicative features can swamp the contribution of many individually weaker features, causing their weights to be undertrained. Such a model is less robust, for the highly-indic...
متن کاملLearning a Product of Experts with Elitist Lasso
Discriminative models such as logistic regression profit from the ability to incorporate arbitrary rich features; however, complex dependencies among overlapping features can often result in weight undertraining. One popular method that attempts to mitigate this problem is logarithmic opinion pools (LOP), which is a specialized form of product of experts model that automatically adjusts the wei...
متن کاملStructured Adaptive Regularization of Weight Vectors for a Robust Grapheme-to-Phoneme Conversion Model
Grapheme-to-phoneme (g2p) conversion, used to estimate the pronunciations of out-of-vocabulary (OOV) words, is a highly important part of recognition systems, as well as text-to-speech systems. The current state-of-the-art approach in g2p conversion is structured learning based on the Margin Infused Relaxed Algorithm (MIRA), which is an online discriminative training method for multiclass class...
متن کاملDiscriminative Learning with Markov Logic Networks
Statistical relational learning (SRL) is an emerging area of research that addresses the problem of learning from noisy structured/relational data. Markov logic networks (MLNs), sets of weighted clauses, are a simple but powerful SRL formalism that combines the expressivity of first-order logic with the flexibility of probabilistic reasoning. Most of the existing learning algorithms for MLNs ar...
متن کاملToward Discriminative Learning of Planning Heuristics
We consider the problem of learning heuristics for controlling forward state-space search in AI planning domain. We draw on a recent framework for “structured output classification” (e.g. syntactic parsing) known as learning as search optimization (LaSO). The LaSO approach uses discriminative learning to optimize heuristic functions for search-based computation of structured outputs and has sho...
متن کاملذخیره در منابع من
با ذخیره ی این منبع در منابع من، دسترسی به آن را برای استفاده های بعدی آسان تر کنید
عنوان ژورنال:
دوره شماره
صفحات -
تاریخ انتشار 2006